258 research outputs found

    Iris: Automatic Generation of Efficient Data Layouts for High Bandwidth Utilization

    Optimizing data movement is becoming one of the biggest challenges in heterogeneous computing as systems cope with the data deluge and, consequently, big data applications. When creating specialized accelerators, modern high-level synthesis (HLS) tools are increasingly efficient at optimizing the computational aspects, but data transfers have not been adequately improved. To combat this, novel architectures such as High-Bandwidth Memory with wider data buses have been developed so that more data can be transferred in parallel. Designers must tailor their hardware/software interfaces to fully exploit the available bandwidth. HLS tools can automate this process, but the designer must follow strict coding-style rules. If the bus width is not evenly divisible by the data width (e.g., when using custom-precision data types) or if the array lengths are not powers of two, the HLS-generated accelerator will likely not fully utilize the available bandwidth, demanding even more manual effort from the designer. We propose a methodology to automatically find and implement a data layout that, when streamed between memory and an accelerator, uses a higher percentage of the available bandwidth than a naive or HLS-optimized design. We borrow concepts from multiprocessor scheduling to achieve such high efficiency. Comment: Accepted for presentation at ASPDAC'2
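
The divisibility problem described in the abstract above can be illustrated with simple arithmetic. The sketch below is not the Iris methodology (which the abstract says borrows from multiprocessor scheduling); it only compares a layout that keeps whole elements inside each bus word with one that packs elements back to back. The function names and the 256-bit bus, 96-bit element, and 100-element array are hypothetical values chosen for illustration.

```python
def words_whole_elements(bus_bits, elem_bits, n_elems):
    """Bus words needed when no element may straddle a bus-word boundary."""
    per_word = bus_bits // elem_bits          # whole elements per bus word
    return -(-n_elems // per_word)            # ceiling division

def words_packed(bus_bits, elem_bits, n_elems):
    """Bus words needed when elements are packed back to back."""
    total_bits = elem_bits * n_elems
    return -(-total_bits // bus_bits)         # ceiling division

def utilization(bus_bits, elem_bits, n_elems, words):
    """Fraction of transferred bits that carry payload."""
    return (elem_bits * n_elems) / (words * bus_bits)

# Hypothetical configuration: 256-bit bus, 96-bit custom-precision elements.
BUS, ELEM, N = 256, 96, 100
naive = words_whole_elements(BUS, ELEM, N)
packed = words_packed(BUS, ELEM, N)
print(naive, utilization(BUS, ELEM, N, naive))
print(packed, utilization(BUS, ELEM, N, packed))
```

Under these assumed numbers, only two 96-bit elements fit in each 256-bit word when elements cannot straddle words, so a quarter of every transfer is padding; the packed layout wastes bits only in the final word.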

    Reconfigurable Computing and Hardware/Software Codesign

    Article ID 731830 - Editorial. Plaks, T. P.; Santambrogio, M. D.; Sciuto, D.

    Exploring the Role of Inter-Organizational Information Systems within SMEs Aggregations

    Interorganizational Information Systems (IOIS) will play a relevant role in shaping competition in the coming years. Even though companies have become extremely efficient in managing information and logistics inside their boundaries, communication and coordination among partners is still far from effective. In practice, one finds both obsolete technologies and very few ICT-supported interorganizational processes. In a global market where the entire supply chain is involved in company success, the proper design and implementation of an IOIS is becoming mandatory. SMEs, and in particular those inside industrial aggregations, could greatly benefit from IOIS implementation; however, a widely accepted IOIS adoption theory is still lacking. Focusing on the description of an industrial aggregation, this paper proposes a framework, its implementation, and a field test on 70 companies belonging to an industrial district, to understand the relationships among the aggregation’s main players. The analysis of the results showed that this approach offers useful insight into the aggregation and suggests its use as a pre-design IOIS tool. 6-8 June 200

    Dataflow Computing with Polymorphic Registers

    Heterogeneous systems are becoming increasingly popular for data processing. They improve the performance of simple kernels applied to large amounts of data. However, sequential data loads may have a negative impact on performance. Data parallel solutions such as Polymorphic Register Files (PRFs) can potentially accelerate applications by facilitating high-speed, parallel access to performance-critical data. Furthermore, through PRF customization, specific data path features are exposed to the programmer in a very convenient way. PRFs allow additional control over the register dimensions and the number of elements that can be simultaneously accessed by computational units. This paper shows how PRFs can be integrated into dataflow computational platforms. In particular, starting from annotated source code, we present a compiler-based methodology that automatically generates the customized PRFs and the enhanced computational kernels that efficiently exploit them.

    The Case for Polymorphic Registers in Dataflow Computing

    Heterogeneous systems are becoming increasingly popular, delivering high performance through hardware specialization. However, sequential data accesses may have a negative impact on performance. Data parallel solutions such as Polymorphic Register Files (PRFs) can potentially accelerate applications by facilitating high-speed, parallel access to performance-critical data. This article shows how PRFs can be integrated into dataflow computational platforms. Our semi-automatic, compiler-based methodology generates customized PRFs and modifies the computational kernels to efficiently exploit them. We use a separable 2D convolution case study to evaluate the impact of memory latency and bandwidth on performance compared to a state-of-the-art NVIDIA Tesla C2050 GPU. We improve the throughput by up to 56.17X and show that the PRF-augmented system outperforms the GPU for 9×9 or larger mask sizes, even in bandwidth-constrained systems.
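
For readers unfamiliar with the case study named above, the following is a minimal sketch of what "separable" means for a 2D convolution: a mask that is the outer product of a column vector and a row vector can be applied as a row pass followed by a column pass. This is a generic illustration, not the paper's PRF-based kernel; the function names and sample data are hypothetical, and the sliding window is applied without flipping the mask.

```python
def conv2d_valid(img, mask):
    """Direct 2D sliding-window sum over the region where the mask fully fits."""
    mh, mw = len(mask), len(mask[0])
    h, w = len(img), len(img[0])
    return [[sum(mask[i][j] * img[r + i][c + j]
                 for i in range(mh) for j in range(mw))
             for c in range(w - mw + 1)]
            for r in range(h - mh + 1)]

def conv2d_separable(img, col_vec, row_vec):
    """Same result for mask = outer(col_vec, row_vec), computed as two 1D passes."""
    k = len(row_vec)
    # Horizontal pass: filter each row with row_vec.
    tmp = [[sum(row_vec[j] * row[c + j] for j in range(k))
            for c in range(len(row) - k + 1)]
           for row in img]
    m = len(col_vec)
    # Vertical pass: filter each column of the intermediate result with col_vec.
    return [[sum(col_vec[i] * tmp[r + i][c] for i in range(m))
             for c in range(len(tmp[0]))]
            for r in range(len(tmp) - m + 1)]

# Hypothetical 3x3 separable mask and 4x4 integer image.
col_vec, row_vec = [1, 2, 1], [1, 0, -1]
mask = [[a * b for b in row_vec] for a in col_vec]
img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 8, 7, 6],
       [5, 4, 3, 2]]
print(conv2d_valid(img, mask) == conv2d_separable(img, col_vec, row_vec))
```

For a k×k mask, the direct form needs k·k multiplications per output element while the two-pass form needs 2k, which is 81 versus 18 for the 9×9 mask size mentioned in the abstract.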

    Looking into the Crystal Ball: From Transistors to the Smart Earth

    ASSURE: RTL Locking Against an Untrusted Foundry

    Semiconductor design companies are integrating proprietary intellectual property (IP) blocks to build custom integrated circuits (ICs) and fabricate them in a third-party foundry. Unauthorized IC copies cost these companies billions of dollars annually. While several methods have been proposed for hardware IP obfuscation, they operate on the gate-level netlist, i.e., after the synthesis tools embed the semantic information into the netlist. We propose ASSURE to protect hardware IP modules operating on the register-transfer level (RTL) description. The RTL approach has three advantages: (i) it allows designers to obfuscate IP cores generated with many different methods (e.g., hardware generators, high-level synthesis tools, and pre-existing IPs); (ii) it obfuscates the semantics of an IC before logic synthesis; (iii) it does not require modifications to EDA flows. We perform a cost and security assessment of ASSURE. Comment: Submitted to IEEE Transactions on VLSI Systems on 11-Oct-2020, 28-Jan-202